Explanation-based Data Augmentation for Image Classification
Existing works have generated explanations for deep neural network decisions to provide insights into model behavior. We observe that these explanations can also be used to identify concepts that caused misclassifications. This allows us to understand the possible limitations of the dataset used to train the model, particularly the under-represented regions in the dataset.
Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition
Ahmed M. Alaa, Mihaela Van Der Schaar
We develop a Bayesian model for decision-making under time pressure with endogenous information acquisition. In our model, the decision-maker decides when to observe (costly) information by sampling an underlying continuous-time stochastic process (time series) that conveys information about the potential occurrence/non-occurrence of an adverse event which will terminate the decision-making process. In her attempt to predict the occurrence of the adverse event, the decision-maker follows a policy that determines when to acquire information from the time series (continuation), and when to stop acquiring information and make a final prediction (stopping). We show that the optimal policy has a "rendezvous" structure, i.e. a structure in which whenever a new information sample is gathered from the time series, the optimal "date" for acquiring the next sample becomes computable. The optimal interval between two information samples balances a trade-off between the decision-maker's "surprise", i.e. the drift in her posterior belief after observing new information, and "suspense", i.e. the probability that the adverse event occurs in the time interval between two information samples. Moreover, we characterize the continuation and stopping regions in the decision-maker's state-space, and show that they depend not only on the decision-maker's beliefs, but also on the "context", i.e. the current realization of the time series.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Asia > Singapore (0.04)
- North America > United States > California (0.04)
OFAL: An Oracle-Free Active Learning Framework
Khorsand, Hadi, Pourahmadi, Vahid
In the active learning paradigm, using an oracle to label data has always been a complex and expensive task, and with the emergence of large unlabeled data pools, it would be highly beneficial if we could achieve better results without relying on an oracle. This research introduces OFAL, an oracle-free active learning scheme that utilizes neural network uncertainty. OFAL uses the model's own uncertainty to transform highly confident unlabeled samples into informative uncertain samples. First, we separate and quantify the different parts of uncertainty and introduce Monte Carlo Dropout as an approximation of the Bayesian neural network model. Second, by adding a variational autoencoder, we generate new uncertain samples by stepping toward the uncertain part of the latent space, starting from a confident seed sample. By generating these new informative samples, we can perform active learning and enhance the model's accuracy. Lastly, we compare and integrate our method with other widely used active learning sampling methods.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Asia > Middle East > Iran (0.04)
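The OFAL abstract above mentions separating and quantifying the different parts of uncertainty using Monte Carlo Dropout. One common way to do this, shown here as a minimal sketch and not necessarily the authors' exact formulation, is the entropy-based split into total, aleatoric, and epistemic uncertainty. The function names and the toy probability vectors are assumptions made for illustration; in practice each vector would come from one dropout-enabled forward pass of the network.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def decompose_uncertainty(mc_probs):
    """Split predictive uncertainty from T stochastic forward passes.

    mc_probs: list of T class-probability vectors, one per pass
    (hypothetical stand-ins for dropout-enabled network outputs).
    Returns (total, aleatoric, epistemic), where
      total     = entropy of the mean prediction,
      aleatoric = mean entropy of the individual predictions,
      epistemic = total - aleatoric (the mutual information).
    """
    t = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / t for c in range(n_classes)]
    total = entropy(mean)
    aleatoric = sum(entropy(p) for p in mc_probs) / t
    return total, aleatoric, total - aleatoric

# Passes that agree: almost no epistemic uncertainty.
agree = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
# Passes that disagree: clearly positive epistemic uncertainty.
disagree = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

agree_parts = decompose_uncertainty(agree)
disagree_parts = decompose_uncertainty(disagree)
```

Under this split, a sample whose passes disagree has high epistemic uncertainty, which is the kind of signal an oracle-free scheme could use to decide which generated samples are informative.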
Data Generation without Function Estimation
Daneshmand, Hadi, Soleymani, Ashkan
Estimating the score function (or other population-density-dependent functions) is a fundamental component of most generative models. However, such function estimation is computationally and statistically challenging. Can we avoid function estimation for data generation? We propose an estimation-free generative method: a set of points whose locations are deterministically updated with (inverse) gradient descent can transport a uniform distribution to an arbitrary data distribution, in the mean field regime, without function estimation, training neural networks, or even noise injection. The proposed method is built upon recent advances in the physics of interacting particles. We show, both theoretically and experimentally, that these advances can be leveraged to develop novel generative methods.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Virginia (0.04)
Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions
Tran, Hoang, Zhang, Zezhong, Bao, Feng, Lu, Dan, Zhang, Guannan
We propose a hybrid generative model for efficient sampling of high-dimensional, multimodal probability distributions for Bayesian inference. Traditional Monte Carlo methods, such as the Metropolis-Hastings and Langevin Monte Carlo sampling methods, are effective for sampling from single-mode distributions in high-dimensional spaces. However, these methods struggle to produce samples with the correct proportions for each mode in multimodal distributions, especially for distributions with well-separated modes. To address the challenges posed by multimodality, we adopt a divide-and-conquer strategy. We start by minimizing the energy function with initial guesses uniformly distributed within the prior domain to identify all the modes of the energy function. Then, we train a classifier to segment the domain corresponding to each mode. After the domain decomposition, we train a diffusion-model-assisted generative model for each identified mode within its support. Once each mode is characterized, we employ bridge sampling to estimate the normalizing constant, allowing us to directly adjust the ratios between the modes. Our numerical examples demonstrate that the proposed framework can effectively handle multimodal distributions with varying mode shapes in up to 100 dimensions. An application to a Bayesian inverse problem for partial differential equations is also provided.
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- North America > United States > Florida > Leon County > Tallahassee (0.04)
- Workflow (0.95)
- Research Report (0.81)
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
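The first step of the divide-and-conquer strategy in the abstract above, minimizing the energy from uniformly distributed initial guesses to identify the modes, can be sketched on a toy problem. Everything specific here is an assumption for illustration: the one-dimensional double-well energy E(x) = (x^2 - 1)^2, the plain gradient descent optimizer, the step size, and the nearest-mode rule, which is only a crude stand-in for the trained classifier. The diffusion-model and bridge-sampling stages are not covered.

```python
import random

def energy_grad(x):
    """Gradient of the toy double-well energy E(x) = (x**2 - 1)**2."""
    return 4.0 * x * (x * x - 1.0)

def descend(x, lr=0.05, steps=500):
    """Plain gradient descent on the toy energy from starting point x."""
    for _ in range(steps):
        x -= lr * energy_grad(x)
    return x

def find_modes(n_starts=20, low=-2.0, high=2.0, tol=1e-3, seed=0):
    """Minimize from uniform random starts and merge near-duplicate minima."""
    rng = random.Random(seed)
    modes = []
    for _ in range(n_starts):
        x_min = descend(rng.uniform(low, high))
        # Keep a minimum only if it is not already in the list (within tol).
        if all(abs(x_min - m) > tol for m in modes):
            modes.append(x_min)
    return sorted(modes)

def assign_mode(x, modes):
    """Nearest-mode label; a crude stand-in for the trained classifier."""
    return min(range(len(modes)), key=lambda k: abs(x - modes[k]))

modes = find_modes()
```

On this toy energy the search recovers the two minima near -1 and 1, and `assign_mode` splits the domain at the midpoint between them. In higher dimensions the same idea would need a proper optimizer and a learned decision boundary.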
STOOD-X methodology: using statistical nonparametric test for OOD Detection Large-Scale datasets enhanced with explainability
Sevillano-García, Iván, Luengo, Julián, Herrera, Francisco
Out-of-Distribution (OOD) detection is a critical task in machine learning, particularly in safety-sensitive applications where model failures can have serious consequences. However, current OOD detection methods often suffer from restrictive distributional assumptions, limited scalability, and a lack of interpretability. To address these challenges, we propose STOOD-X, a two-stage methodology that combines a Statistical nonparametric Test for OOD Detection with eXplainability enhancements. In the first stage, STOOD-X uses feature-space distances and a Wilcoxon-Mann-Whitney test to identify OOD samples without assuming a specific feature distribution. In the second stage, it generates user-friendly, concept-based visual explanations that reveal the features driving each decision, aligning with the BLUE XAI paradigm. Through extensive experiments on benchmark datasets and multiple architectures, STOOD-X achieves competitive performance against state-of-the-art post hoc OOD detectors, particularly in high-dimensional and complex settings. In addition, its explainability framework enables human oversight, bias detection, and model debugging, fostering trust and collaboration between humans and AI systems. The STOOD-X methodology therefore offers a robust, explainable, and scalable solution for real-world OOD detection tasks.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
- Overview (1.00)
- Research Report > Promising Solution (0.46)
- Research Report > New Finding (0.46)
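The first stage of STOOD-X, as described in the abstract above, applies a Wilcoxon-Mann-Whitney test to feature-space distances. The sketch below shows a one-sided version of that test in pure Python. It is not the authors' implementation: the distance values, the choice of which two samples to compare, and the normal approximation without tie or continuity correction are all assumptions made for illustration.

```python
import math

def mann_whitney_greater(x, y):
    """One-sided Wilcoxon-Mann-Whitney test that x tends to exceed y.

    Returns (U, p) using the normal approximation without tie or
    continuity correction, so p is only approximate for small samples.
    """
    n, m = len(x), len(y)
    # U counts pairs (xi, yj) with xi > yj; ties contribute one half.
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    mean = n * m / 2.0
    std = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean) / std
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability
    return u, p

# Hypothetical distances from a test sample to nearby training features.
test_dists = [2.1, 2.4, 2.2, 2.8, 2.5, 2.6, 2.3, 2.7]
# Hypothetical reference distances measured on in-distribution data.
reference_dists = [1.0, 1.3, 1.1, 1.6, 1.2, 1.5, 1.4, 1.7]

ood_u, ood_p = mann_whitney_greater(test_dists, reference_dists)
same_u, same_p = mann_whitney_greater(reference_dists, reference_dists)
```

A small p-value suggests the test sample sits farther from the training features than in-distribution samples typically do, so it would be flagged as OOD at a chosen significance level. Because the test is rank-based, it makes no assumption about the shape of the distance distribution.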
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
Ma, Yanbiao, Dai, Wei, Huang, Wenke, Chen, Jiayi
Data heterogeneity in federated learning, characterized by a significant misalignment between local and global distributions, leads to divergent local optimization directions and hinders global model training. Existing studies mainly focus on optimizing local updates or global aggregation, but these indirect approaches demonstrate instability when handling highly heterogeneous data distributions, especially in scenarios where label skew and domain skew coexist. To address this, we propose a geometry-guided data generation method that centers on simulating the global embedding distribution locally. We first introduce the concept of the geometric shape of an embedding distribution and then address the challenge of obtaining global geometric shapes under privacy constraints. Subsequently, we propose GGEUR, which leverages global geometric shapes to guide the generation of new samples, enabling a closer approximation to the ideal global distribution. In single-domain scenarios, we augment samples based on global geometric shapes to enhance model generalization; in multi-domain scenarios, we further employ class prototypes to simulate the global distribution across domains. Extensive experimental results demonstrate that our method significantly enhances the performance of existing approaches in handling highly heterogeneous data, including scenarios with label skew, domain skew, and their coexistence. Code published at: https://github.com/WeiDai-David/2025CVPR_GGEUR
- North America > United States > Virginia (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)